oss project
Patterns in the Transition From Founder-Leadership to Community Governance of Open Source
Noori, Mobina, Chakraborti, Mahasweta, Zhang, Amy X, Frey, Seth
Open digital public infrastructure needs community management to ensure accountability, sustainability, and robustness. Yet open-source projects often rely on centralized decision-making, and the determinants of successful community management remain unclear. We analyze 637 GitHub repositories to trace transitions from founder-led to shared governance. Specifically, we document trajectories to community governance by extracting institutional roles, actions, and deontic cues from version-controlled project constitutions (GOVERNANCE.md). With a semantic parsing pipeline, we cluster elements into broader role and action types. We find roles and actions grow, and regulation becomes more balanced, reflecting increases in governance scope and differentiation over time. Rather than shifting tone, communities grow by layering and refining responsibilities. As transitions to community management mature, projects increasingly regulate ecosystem-level relationships and add definition to project oversight roles. Overall, this work offers a scalable pipeline for tracking the growth and development of community governance regimes from open-source software's familiar default of founder-ownership.
- North America > United States > California > Yolo County > Davis (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Law (1.00)
- Government (1.00)
- Information Technology (0.93)
Teaching Code Refactoring Using LLMs
Khairnar, Anshul, Rajoju, Aarya, Gehringer, Edward F.
This Innovative Practice full paper explores how Large Language Models (LLMs) can enhance the teaching of code refactoring in software engineering courses through real-time, context-aware feedback. Refactoring improves code quality but is difficult to teach, especially with complex, real-world codebases. Traditional methods like code reviews and static analysis tools offer limited, inconsistent feedback. Our approach integrates LLM-assisted refactoring into a course project using structured prompts to help students identify and address code smells such as long methods and low cohesion. Implemented in Spring 2025 in a long-lived OSS project, the intervention is evaluated through student feedback and planned analysis of code quality improvements. Findings suggest that LLMs can bridge theoretical and practical learning, supporting a deeper understanding of maintainability and refactoring principles. Despite the importance of refactoring, teaching effective techniques remains challenging, particularly when students encounter real-world, complex codebases rather than contrived examples [2]. Students often struggle with identifying refactoring opportunities in unfamiliar code and implementing appropriate transformations that preserve functionality while enhancing quality. Open Source Software (OSS) projects offer an authentic environment for students to practice refactoring skills.
- North America > United States > North Carolina > Wake County > Raleigh (0.05)
- Atlantic Ocean > North Atlantic Ocean > Baltic Sea (0.04)
- Asia > China > Hebei Province > Shijiazhuang (0.04)
Characterising Open Source Co-opetition in Company-hosted Open Source Software Projects: The Cases of PyTorch, TensorFlow, and Transformers
Osborne, Cailean, Daneshyan, Farbod, He, Runzhi, Ye, Hengzhi, Zhang, Yuxia, Zhou, Minghui
Companies, including market rivals, have long collaborated on the development of open source software (OSS), resulting in a tangle of co-operation and competition known as "open source co-opetition". While prior work investigates open source co-opetition in OSS projects that are hosted by vendor-neutral foundations, we have a limited understanding thereof in OSS projects that are hosted and governed by one company. Given their prevalence, it is timely to investigate open source co-opetition in such contexts. Towards this end, we conduct a mixed-methods analysis of three company-hosted OSS projects in the artificial intelligence (AI) industry: Meta's PyTorch (prior to its donation to the Linux Foundation), Google's TensorFlow, and Hugging Face's Transformers. We contribute three key findings. First, while the projects exhibit similar code authorship patterns between host and external companies (80%/20% of commits), collaborations are structured differently (e.g., decentralised vs. hub-and-spoke networks). Second, host and external companies engage in strategic, non-strategic, and contractual collaborations, with varying incentives and collaboration practices. Some of the observed collaborations are specific to the AI industry (e.g., hardware-software optimizations or AI model integrations), while others are typical of the broader software industry (e.g., bug fixing or task outsourcing). Third, single-vendor governance creates a power imbalance that influences open source co-opetition practices and possibilities, from the host company's singular decision-making power (e.g., the risk of license change) to their community involvement strategy (e.g., from over-control to over-delegation). We conclude with recommendations for future research.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (0.68)
- Personal > Interview (0.46)
- Information Technology > Services (1.00)
- Law (0.93)
Why Companies "Democratise" Artificial Intelligence: The Case of Open Source Software Donations
Companies claim to "democratise" artificial intelligence (AI) when they donate AI open source software (OSS) to non-profit foundations or release AI models, among others, but what does this term mean and why do they do it? As the impact of AI on society and the economy grows, understanding the commercial incentives behind AI democratisation efforts is crucial for ensuring these efforts serve broader interests beyond commercial agendas. Towards this end, this study employs a mixed-methods approach to investigate commercial incentives for 43 AI OSS donations to the Linux Foundation. It makes contributions to both research and practice. It contributes a taxonomy of both individual and organisational social, economic, and technological incentives for AI democratisation. In particular, it highlights the role of democratising the governance and control rights of an OSS project (i.e., from one company to open governance) as a structural enabler for downstream goals, such as attracting external contributors, reducing development costs, and influencing industry standards, among others. Furthermore, OSS donations are often championed by individual developers within companies, highlighting the importance of the bottom-up incentives for AI democratisation. The taxonomy provides a framework and toolkit for discerning incentives for other AI democratisation efforts, such as the release of AI models. The paper concludes with a discussion of future research directions.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Software (1.00)
- Information Technology > Services (1.00)
- Social Sector (0.88)
Public-private funding models in open source software development: A case study on scikit-learn
Governments are increasingly funding open source software (OSS) development to support software security, digital sovereignty, and national competitiveness in science and innovation, amongst others. However, little is known about how OSS developers evaluate the relative benefits and drawbacks of governmental funding for OSS. This study explores this question through a case study on scikit-learn, a Python library for machine learning, funded by public research grants, commercial sponsorship, micro-donations, and a 32 million euro grant announced in France's artificial intelligence strategy. Through 25 interviews with scikit-learn's maintainers and funders, this study makes two key contributions. First, it contributes empirical findings about the benefits and drawbacks of public and private funding in an impactful OSS project, and the governance protocols employed by the maintainers to balance the diverse interests of their community and funders. Second, it offers practical lessons on funding for OSS developers, governments, and companies based on the experience of scikit-learn. The paper concludes with key recommendations for practitioners and future research directions.
- Europe > France (0.67)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Banking & Finance (1.00)
- Government > Regional Government > Europe Government (0.94)
- Information Technology > Security & Privacy (0.66)
Trust in Motion: Capturing Trust Ascendancy in Open-Source Projects using Hybrid AI
Sanchez, Huascar, Hitaj, Briland
Open-source is frequently described as a driver for unprecedented communication and collaboration, and the process works best when projects support teamwork. Yet, open-source cooperation processes in no way protect project contributors from considerations of trust, power, and influence. Indeed, achieving the level of trust necessary to contribute to a project and thus influence its direction is a constant process of change, and developers take many different routes over many communication channels to achieve it. We refer to this process of influence-seeking and trust-building as trust ascendancy. This paper describes a methodology for understanding the notion of trust ascendancy and introduces the capabilities that are needed to localize trust ascendancy operations happening over open-source projects. Much of the prior work in understanding trust in open-source software development has focused on a static view of the problem using different forms of quantity measures. However, trust ascendancy is not static, but rather adapts to changes in the open-source ecosystem in response to new input. This paper is the first attempt to articulate and study these signals from a dynamic view of the problem. In that respect, we identify related work that may help illuminate research challenges, implementation tradeoffs, and complementary solutions. Our preliminary results show the effectiveness of our method at capturing the trust ascendancy developed by individuals involved in a well-documented 2020 social engineering attack. Our future plans highlight research challenges and encourage cross-disciplinary collaboration to create more automated, accurate, and efficient ways to model and then track trust ascendancy in open-source projects.
- North America > United States > Minnesota (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
- Information Technology > Security & Privacy (0.48)
- Government (0.47)
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (0.94)
Predicting long-time contributors for GitHub projects using machine learning
Many organizations develop software systems using open source software (OSS), which is risky due to the high possibility of losing support. Contributors are critical for the survival of OSS projects, but very few new contributors remain with OSS projects to become long-time contributors (LTCs). Identifying the factors that contribute to becoming an LTC can help OSS project owners use limited resources to retain new contributors. In this paper, we investigate whether we can effectively predict which new contributors to OSS repositories will become long-time contributors, based on repository and contributor metadata collected from GitHub. We construct a dataset containing 70,899 observations from the 888 most popular repositories with 56,766 contributors.
A Measurement of Social Capital in an Open Source Software Project
Alqithami, Saad, Alzahrani, Musaad, Alghamdi, Fahad, Budiarto, Rahmat, Hexmoor, Henry
This paper provides an understanding of social capital in organizations that are open-membership multi-agent systems. Our formulation emphasizes the dynamic network of social interaction, which in part elucidates the evolving structures and impromptu topologies of networks. The paper therefore models an open source project as an organizational network. It provides definitions of social capital for this organizational network and formulates a mechanism for optimizing social capital toward the network's goal of optimized productivity. A case study of an open source Apache-Hadoop project is considered and empirically evaluated. An analysis is provided of how social capital can be created within this type of organization and how its value can be measured. Finally, the paper examines whether the social capital of the organizational network is proportional to the optimization of its productivity.
- North America > United States > Illinois > Jackson County > Carbondale (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > California > Los Angeles County > Beverly Hills (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Quality Classifiers for Open Source Software Repositories
Tsatsaronis, George, Halkidi, Maria, Giakoumakis, Emmanouel A.
Initial open source software (OSS) projects rely on large repositories for hosting and distribution until they become independent. A huge amount of project metadata is collected and maintained in such software repositories providing useful information about projects and their success. In this paper we propose a data mining approach that processes the metadata contained in such OSS repositories. The proposed approach aims at the construction of a classifier that is trained on the metadata of existing projects and predicts the successful continuation of any given OSS. The successfulness of a project is defined with regard to the confidence level of the classifier which predicts that this project will be ported in widely used OSS projects (e.g.
- North America > United States > Indiana > Porter County > Portage (0.04)
- Europe > France (0.04)